OCR exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Sparse Discriminative Information Preservation for Chinese character font categorization

Internal identifier: 000122 (Main/Exploration); previous: 000121; next: 000123

Authors: DAPENG TAO [People's Republic of China]; LIANWEN JIN [People's Republic of China]; SHUYE ZHANG [People's Republic of China]; ZHAO YANG [People's Republic of China]; YONGFEI WANG [People's Republic of China]

Source: Neurocomputing (Amsterdam), 2014, ISSN 0925-2312

RBID: Pascal:14-0129513

French descriptors

English descriptors

Abstract

With the rapid development of optical character recognition (OCR), font categorization is becoming increasingly important, because font information has a wide range of uses that researchers have only recently come to recognize. In this paper, we propose a new scheme for Chinese character font categorization (CCFC) that represents font information with LBP descriptors computed at interest points of Chinese characters. Specifically, it classifies Chinese character fonts by combining a new feature selection method, Sparse Discriminative Information Preservation (SDIP), with an NN classifier. SDIP addresses three aspects: (1) it preserves the local geometric structure of the intra-class samples and simultaneously maximizes the margin between the inter-class samples on the local patch; (2) it models the reconstruction error to preserve the prior information of the data distribution; and (3) it introduces an L1-norm penalty to make the projection matrix sparse. We conduct experiments on our newly collected text block images, which cover 25 popular Chinese fonts. The average recognition rate demonstrates the robustness and effectiveness of SDIP for CCFC.
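
To make the pipeline named in the abstract more concrete, here is a minimal Python sketch of its two outer stages only: an LBP texture descriptor and a nearest-neighbour (NN) classifier. It is not the authors' implementation. It assumes the basic 3x3 LBP operator applied to every interior pixel (the paper computes descriptors at interest points), it omits the SDIP projection entirely, and it assumes Euclidean distance for the NN step. The function names, the toy images, and the labels `font_a` and `font_b` are hypothetical.

```python
from math import dist

# Offsets of the 8 neighbours of a pixel, clockwise from the top-left corner.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(image):
    """Return the normalised 256-bin histogram of basic 3x3 LBP codes.

    `image` is a list of rows of grey levels; border pixels are skipped.
    """
    rows, cols = len(image), len(image[0])
    hist = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                # A neighbour at least as bright as the centre sets its bit.
                if image[r + dr][c + dc] >= image[r][c]:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def nearest_neighbour(query, training_set):
    """Return the label of the training feature closest to `query`."""
    return min(training_set, key=lambda item: dist(query, item[0]))[1]


# Toy 4x4 "text blocks"; the labels are hypothetical stand-ins for font names.
vertical = [[0, 9, 0, 9] for _ in range(4)]
horizontal = [[0, 0, 0, 0], [9, 9, 9, 9], [0, 0, 0, 0], [9, 9, 9, 9]]
training_set = [
    (lbp_histogram(vertical), "font_a"),
    (lbp_histogram(horizontal), "font_b"),
]

# Same stripe pattern as `vertical`, but with different grey levels.
query = [[1, 7, 1, 7] for _ in range(4)]
label = nearest_neighbour(lbp_histogram(query), training_set)
print(label)
```

Because LBP codes depend only on the ordering of grey levels around each pixel, the query matches the vertically striped training block even though its intensities differ. In the scheme described by the abstract, a learned sparse projection would be applied to the features before the NN step.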


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Sparse Discriminative Information Preservation for Chinese character font categorization</title>
<author>
<name sortKey="Dapeng Tao" sort="Dapeng Tao" uniqKey="Dapeng Tao" last="Dapeng Tao">DAPENG TAO</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>School of Electronic and Information Engineering, South China University of Technology</s1>
<s2>Guangzhou 510640</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Jiangmen</settlement>
<region type="province">Guangdong</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Lianwen Jin" sort="Lianwen Jin" uniqKey="Lianwen Jin" last="Lianwen Jin">LIANWEN JIN</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>School of Electronic and Information Engineering, South China University of Technology</s1>
<s2>Guangzhou 510640</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Jiangmen</settlement>
<region type="province">Guangdong</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Shuye Zhang" sort="Shuye Zhang" uniqKey="Shuye Zhang" last="Shuye Zhang">SHUYE ZHANG</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>School of Electronic and Information Engineering, South China University of Technology</s1>
<s2>Guangzhou 510640</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Jiangmen</settlement>
<region type="province">Guangdong</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Zhao Yang" sort="Zhao Yang" uniqKey="Zhao Yang" last="Zhao Yang">ZHAO YANG</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>School of Electronic and Information Engineering, South China University of Technology</s1>
<s2>Guangzhou 510640</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Jiangmen</settlement>
<region type="province">Guangdong</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Yongfei Wang" sort="Yongfei Wang" uniqKey="Yongfei Wang" last="Yongfei Wang">YONGFEI WANG</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>School of Electronic and Information Engineering, South China University of Technology</s1>
<s2>Guangzhou 510640</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Jiangmen</settlement>
<region type="province">Guangdong</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">14-0129513</idno>
<date when="2014">2014</date>
<idno type="stanalyst">PASCAL 14-0129513 INIST</idno>
<idno type="RBID">Pascal:14-0129513</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000021</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000743</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000005</idno>
<idno type="wicri:doubleKey">0925-2312:2014:Dapeng Tao:sparse:discriminative:information</idno>
<idno type="wicri:Area/Main/Merge">000123</idno>
<idno type="wicri:Area/Main/Curation">000122</idno>
<idno type="wicri:Area/Main/Exploration">000122</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Sparse Discriminative Information Preservation for Chinese character font categorization</title>
<author>
<name sortKey="Dapeng Tao" sort="Dapeng Tao" uniqKey="Dapeng Tao" last="Dapeng Tao">DAPENG TAO</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>School of Electronic and Information Engineering, South China University of Technology</s1>
<s2>Guangzhou 510640</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Jiangmen</settlement>
<region type="province">Guangdong</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Lianwen Jin" sort="Lianwen Jin" uniqKey="Lianwen Jin" last="Lianwen Jin">LIANWEN JIN</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>School of Electronic and Information Engineering, South China University of Technology</s1>
<s2>Guangzhou 510640</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Jiangmen</settlement>
<region type="province">Guangdong</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Shuye Zhang" sort="Shuye Zhang" uniqKey="Shuye Zhang" last="Shuye Zhang">SHUYE ZHANG</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>School of Electronic and Information Engineering, South China University of Technology</s1>
<s2>Guangzhou 510640</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Jiangmen</settlement>
<region type="province">Guangdong</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Zhao Yang" sort="Zhao Yang" uniqKey="Zhao Yang" last="Zhao Yang">ZHAO YANG</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>School of Electronic and Information Engineering, South China University of Technology</s1>
<s2>Guangzhou 510640</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Jiangmen</settlement>
<region type="province">Guangdong</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Yongfei Wang" sort="Yongfei Wang" uniqKey="Yongfei Wang" last="Yongfei Wang">YONGFEI WANG</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>School of Electronic and Information Engineering, South China University of Technology</s1>
<s2>Guangzhou 510640</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Jiangmen</settlement>
<region type="province">Guangdong</region>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Neurocomputing : (Amsterdam)</title>
<title level="j" type="abbreviated">Neurocomputing : (Amst.)</title>
<idno type="ISSN">0925-2312</idno>
<imprint>
<date when="2014">2014</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Neurocomputing : (Amsterdam)</title>
<title level="j" type="abbreviated">Neurocomputing : (Amst.)</title>
<idno type="ISSN">0925-2312</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Block code</term>
<term>Categorization</term>
<term>Character recognition</term>
<term>Chinese</term>
<term>Classification</term>
<term>Computer vision</term>
<term>Data distribution</term>
<term>Dimension reduction</term>
<term>Geometrical model</term>
<term>Ideogram</term>
<term>L1 approximation</term>
<term>Local structure</term>
<term>Modeling</term>
<term>Optical character recognition</term>
<term>Prior distribution</term>
<term>Prior information</term>
<term>Robustness</term>
<term>Selection criterion</term>
<term>Sparse representation</term>
<term>Text</term>
<term>User behavior</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Reconnaissance optique caractère</term>
<term>Reconnaissance caractère</term>
<term>Classification</term>
<term>Information a priori</term>
<term>Texte</term>
<term>Chinois</term>
<term>Vision ordinateur</term>
<term>Idéogramme</term>
<term>Catégorisation</term>
<term>Comportement utilisateur</term>
<term>Critère sélection</term>
<term>Structure locale</term>
<term>Modèle géométrique</term>
<term>Modélisation</term>
<term>Loi a priori</term>
<term>Distribution donnée</term>
<term>Approximation L1</term>
<term>Robustesse</term>
<term>Réduction dimension</term>
<term>Représentation parcimonieuse</term>
<term>Code bloc</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Classification</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">With the rapid development of optical character recognition (OCR), font categorization is becoming increasingly important, because font information has a wide range of uses that researchers have only recently come to recognize. In this paper, we propose a new scheme for Chinese character font categorization (CCFC) that represents font information with LBP descriptors computed at interest points of Chinese characters. Specifically, it classifies Chinese character fonts by combining a new feature selection method, Sparse Discriminative Information Preservation (SDIP), with an NN classifier. SDIP addresses three aspects: (1) it preserves the local geometric structure of the intra-class samples and simultaneously maximizes the margin between the inter-class samples on the local patch; (2) it models the reconstruction error to preserve the prior information of the data distribution; and (3) it introduces an L1-norm penalty to make the projection matrix sparse. We conduct experiments on our newly collected text block images, which cover 25 popular Chinese fonts. The average recognition rate demonstrates the robustness and effectiveness of SDIP for CCFC.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>République populaire de Chine</li>
</country>
<region>
<li>Guangdong</li>
</region>
<settlement>
<li>Jiangmen</li>
</settlement>
</list>
<tree>
<country name="République populaire de Chine">
<region name="Guangdong">
<name sortKey="Dapeng Tao" sort="Dapeng Tao" uniqKey="Dapeng Tao" last="Dapeng Tao">DAPENG TAO</name>
</region>
<name sortKey="Lianwen Jin" sort="Lianwen Jin" uniqKey="Lianwen Jin" last="Lianwen Jin">LIANWEN JIN</name>
<name sortKey="Shuye Zhang" sort="Shuye Zhang" uniqKey="Shuye Zhang" last="Shuye Zhang">SHUYE ZHANG</name>
<name sortKey="Yongfei Wang" sort="Yongfei Wang" uniqKey="Yongfei Wang" last="Yongfei Wang">YONGFEI WANG</name>
<name sortKey="Zhao Yang" sort="Zhao Yang" uniqKey="Zhao Yang" last="Zhao Yang">ZHAO YANG</name>
</country>
</tree>
</affiliations>
</record>
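
The record above can also be read programmatically. The following Python sketch uses only the standard library and pulls out the title, authors, journal, ISSN, and year. It rests on one assumption: the export uses the `wicri:` and `inist:` prefixes without declaring them, so the sketch binds them to placeholder namespace URIs (`urn:example:wicri` and `urn:example:inist`, both invented here) before parsing. The function name `parse_record` and the shortened `sample` string are likewise illustrative.

```python
import xml.etree.ElementTree as ET


def parse_record(xml_text):
    """Extract basic bibliographic fields from a Wicri <record> export."""
    # The wicri: and inist: prefixes are undeclared in the export, which
    # would make the parser fail. The URIs below are placeholders (an
    # assumption), bound only so that the document parses.
    patched = xml_text.replace(
        "<record>",
        '<record xmlns:wicri="urn:example:wicri" xmlns:inist="urn:example:inist">',
        1,
    )
    root = ET.fromstring(patched)
    analytic = root.find(".//analytic")
    return {
        "title": analytic.findtext("title"),
        "authors": [name.text for name in analytic.findall("author/name")],
        "journal": root.findtext(".//series/title[@type='main']"),
        "issn": root.findtext(".//series/idno[@type='ISSN']"),
        "year": root.find(".//series/imprint/date").get("when"),
    }


# A shortened copy of the record, kept small for illustration.
sample = """<record><TEI><teiHeader><fileDesc><sourceDesc><biblStruct>
<analytic>
<title xml:lang="en" level="a">Sparse Discriminative Information Preservation for Chinese character font categorization</title>
<author><name>DAPENG TAO</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s3>CHN</s3></inist:fA14></affiliation>
</author>
<author><name>LIANWEN JIN</name></author>
</analytic>
<series>
<title level="j" type="main">Neurocomputing : (Amsterdam)</title>
<idno type="ISSN">0925-2312</idno>
<imprint><date when="2014">2014</date></imprint>
</series>
</biblStruct></sourceDesc></fileDesc></teiHeader></TEI></record>"""

info = parse_record(sample)
print(info["authors"], info["journal"], info["year"])
```

The lookup starts from the `analytic` element because the author list appears twice in the full record, once under `titleStmt` and once under `sourceDesc`; restricting the search avoids counting each author twice.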

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000122 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000122 | SxmlIndent | more

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:14-0129513
   |texte=   Sparse Discriminative Information Preservation for Chinese character font categorization
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024